Improving latency tolerance of multithreading through decoupling
Authors
Abstract
Similar Resources
Improving Latency Tolerance of Multithreading through Decoupling
The increasing hardware complexity of dynamically scheduled superscalar processors may compromise the scalability of this organization to make efficient use of future increases in transistor budget. SMT processors, designed over a superscalar core, are therefore directly concerned by this problem. This work presents and evaluates a novel processor microarchitecture which combines two paradi...
Improving Latency Tolerance of Network Processors Through Simultaneous Multithreading
Existing multithreaded network processor architectures with multiple processing engines (PEs) aim to exploit the blocked multithreading technique, which executes instructions of different user-defined threads in the same PE pipeline in an explicit, interleaved way. Multiple PEs, each of which is a multithreaded processor core, process several packets in parallel to hide long memory acc...
Latency Tolerance through Multithreading in Large-Scale Multiprocessors
In large-scale distributed-memory multiprocessors, remote memory accesses suffer significant latencies. Caches help alleviate the memory latency problem by maintaining local copies of frequently used data. However, they cannot eliminate the latency caused by first-time references and invalidations needed to enforce cache coherence. Multithreaded processors tolerate such latencies by rapidly switchi...
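The saturation behavior this abstract alludes to (enough ready threads to cover each remote-access stall) is often captured with a standard first-order model of blocked multithreading; the function name and parameters below are illustrative sketches, not taken from the paper:

```python
def efficiency(run_cycles: float, stall_cycles: float, nthreads: int) -> float:
    """First-order model of blocked multithreading: each thread runs for
    `run_cycles`, then stalls for `stall_cycles` on a remote memory access,
    and the core switches to another ready thread on each stall
    (context-switch overhead ignored). Utilization saturates at 1.0 once
    (nthreads - 1) * run_cycles >= stall_cycles, i.e. the other threads
    fully cover the stall."""
    return min(1.0, nthreads * run_cycles / (run_cycles + stall_cycles))

# With 10 cycles of work per 90-cycle remote stall, one thread keeps the
# core only 10% busy; ten threads hide the latency completely.
print(efficiency(10, 90, 1))   # 0.1
print(efficiency(10, 90, 10))  # 1.0
```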
Per-Node Multithreading and Remote Latency
This paper evaluates the use of per-node multithreading to hide remote memory and synchronization latencies in software DSMs. As with hardware systems, multithreading in software systems can be used to reduce the costs of remote requests by running other threads when the current thread...
Compiler Generated Multithreading to Alleviate Memory Latency
Since the era of vector and pipelined computing, computational speed has been limited by memory access time. Faster caches and more cache levels are used to bridge the growing gap between memory and processor speeds. With the advent of multithreaded processors, it becomes feasible to concurrently fetch data and compute in two cooperating threads. A technique is presented to generate these...
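The two-cooperating-threads idea (one thread fetching data ahead while the other computes) can be sketched in software with a bounded queue between a prefetch thread and a compute thread. All names below are illustrative, and the `sleep` merely stands in for a long memory access:

```python
import threading
import queue
import time

def prefetch_thread(data, q):
    """Fetch-ahead thread: loads each element (simulated memory latency)
    and hands it to the compute thread through a bounded queue."""
    for x in data:
        time.sleep(0.001)  # stand-in for a long memory access
        q.put(x)
    q.put(None)  # sentinel: no more data

def compute_thread(q, results):
    """Compute thread: consumes fetched values and does the arithmetic,
    overlapping its work with the fetch thread's memory accesses."""
    while True:
        x = q.get()
        if x is None:
            break
        results.append(x * x)

data = list(range(8))
q = queue.Queue(maxsize=4)  # bounded buffer decoupling fetch from compute
results = []
t1 = threading.Thread(target=prefetch_thread, args=(data, q))
t2 = threading.Thread(target=compute_thread, args=(q, results))
t1.start(); t2.start()
t1.join(); t2.join()
print(results)  # squares of 0..7
```

The bounded queue plays the role of the architectural buffer between the access and execute streams: the fetch thread can run ahead by at most the queue depth, exactly the slip that hides memory latency.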
Journal
Journal title: IEEE Transactions on Computers
Year: 2001
ISSN: 0018-9340
DOI: 10.1109/12.956093